Sparse Principal Component Analysis via Regularized Low Rank Matrix Approximation

Authors

  • Haipeng Shen
  • Jianhua Z. Huang
Abstract

Principal component analysis (PCA) is a widely used tool for data analysis and dimension reduction in applications throughout science and engineering. However, the principal components (PCs) can sometimes be difficult to interpret, because they are linear combinations of all the original variables. To facilitate interpretation, sparse PCA produces modified PCs with sparse loadings, i.e. loadings with very few non-zero elements. In this paper, we propose a new sparse PCA method, namely sparse PCA via regularized SVD (sPCA-rSVD). We use the connection of PCA with singular value decomposition (SVD) of the data matrix and extract the PCs through solving a low rank matrix approximation problem. Regularization penalties are introduced to the corresponding minimization problem to promote sparsity in PC loadings. An efficient iterative algorithm is proposed for computation. Two tuning parameter selection methods are discussed. Some theoretical results are established to justify the use of sPCA-rSVD when only the data covariance matrix is available. In addition, we give a modified definition of variance explained by the sparse PCs. The sPCA-rSVD provides a uniform treatment of both classical multivariate data and High-Dimension-Low-Sample-Size data. Further understanding of sPCA-rSVD and some existing alternatives is gained through simulation studies and real data examples, which suggests that sPCA-rSVD provides competitive results.
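The abstract describes extracting sparse loadings by adding a penalty to a low rank matrix approximation problem and solving it iteratively, but it does not spell out the penalty or the update rules. The following is a minimal sketch of one way a rank-one version could look, assuming an L1 (soft-thresholding) penalty and alternating updates of the two factors. The function names `soft_threshold` and `sparse_rank_one`, the parameter `lam`, and the stopping rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def soft_threshold(z, lam):
    """Shrink each entry of z toward zero by lam (assumed L1 penalty)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def sparse_rank_one(X, lam, n_iter=200, tol=1e-8):
    """Sketch: approximate X by u v^T with a sparse v.

    X   : (n, p) column-centered data matrix
    lam : hypothetical tuning parameter; larger values give sparser loadings
    Returns a unit-norm u and a unit-norm sparse loading vector
    (or an all-zero loading vector if lam removes every entry).
    """
    # Start from the ordinary rank-one SVD approximation.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    u = U[:, 0]
    v = s[0] * Vt[0]

    for _ in range(n_iter):
        # Update v with u fixed: thresholded regression of X on u.
        v_new = soft_threshold(X.T @ u, lam)
        if not np.any(v_new):
            return u, v_new  # penalty too strong: every loading is zero

        # Update u with v fixed, then rescale to unit length.
        Xv = X @ v_new
        u = Xv / np.linalg.norm(Xv)

        converged = np.linalg.norm(v_new - v) <= tol * max(np.linalg.norm(v), 1.0)
        v = v_new
        if converged:
            break

    return u, v / np.linalg.norm(v)


# Usage on synthetic data where only the first two variables carry signal.
rng = np.random.default_rng(0)
scores = rng.standard_normal(50)
pattern = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
X = np.outer(scores, pattern) + 0.01 * rng.standard_normal((50, 6))
u, loading = sparse_rank_one(X, lam=0.5)
print(loading)
```

With `lam=0` the sketch reduces to the leading right singular vector of `X`, which is the ordinary first PC loading vector. How `lam` should be chosen is exactly what the two tuning parameter selection methods mentioned in the abstract address; this sketch leaves it to the caller.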


Similar resources

Value function approximation via low-rank models

We propose a novel value function approximation technique for Markov decision processes. We consider the problem of compactly representing the state-action value function using a low-rank and sparse matrix model. The problem is to decompose a matrix that encodes the true value function into low-rank and sparse components, and we achieve this using Robust Principal Component Analysis (PCA). Unde...


Fast Automatic Background Extraction via Robust PCA

Recent years have seen an explosion of interest in applications of sparse signal recovery and low rank matrix completion, due in part to the compelling use of the nuclear norm as a convex proxy for matrix rank. In some cases, minimizing the nuclear norm is equivalent to minimizing the rank of a matrix, and can lead to exact recovery of the underlying rank structure, see [Faz02, RFP10] for backg...


CS 267 Final Project: Parallel Robust PCA

Principal Component Analysis (PCA; Pearson, 1901) is a widely used method for data compression. The goal is to find the best low rank approximation of a given matrix, as judged by minimization of the ℓ2 norm of the difference between the original matrix and the low rank approximation. However, the classical method is not resistant to corruption of individual input data points. Recently, a robus...


SAR Target Recognition via Local Sparse Representation of Multi-Manifold Regularized Low-Rank Approximation

The extraction of a valuable set of features and the design of a discriminative classifier are crucial for target recognition in SAR image. Although various features and classifiers have been proposed over the years, target recognition under extended operating conditions (EOCs) is still a challenging problem, e.g., target with configuration variation, different capture orientations, and articul...


Non-Convex Rank Minimization via an Empirical Bayesian Approach

In many applications that require matrix solutions of minimal rank, the underlying cost function is non-convex leading to an intractable, NP-hard optimization problem. Consequently, the convex nuclear norm is frequently used as a surrogate penalty term for matrix rank. The problem is that in many practical scenarios there is no longer any guarantee that we can correctly estimate generative low-...




Publication date: 2007